19 research outputs found
Crisis Event Extraction Service (CREES) - Automatic Detection and Classification of Crisis-related Content on Social Media
Social media posts tend to provide valuable reports during crises. However, this information can be hidden in large amounts of unrelated documents. Providing tools that automatically identify relevant posts, event types (e.g., hurricane, floods, etc.) and information categories (e.g., reports on affected individuals, donations and volunteering, etc.) in social media posts is vital for their efficient handling and consumption. We introduce the Crisis Event Extraction Service (CREES), an open-source web API that automatically classifies posts during crisis situations. The API provides annotations for crisis-related documents, event types and information categories through an easily deployable and accessible web API that can be integrated into multiple platforms and tools. The annotation service is backed by Convolutional Neural Networks (CNNs) and validated against traditional machine learning models. Results show that the CNN-based API can be relied upon when dealing with specific crises, with the benefits associated with the use of word embeddings.
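The abstract above describes an annotation service that labels each post along three dimensions: crisis-relatedness, event type and information category. The contract of such a service can be sketched with a simple stand-in classifier; here a keyword heuristic plays the role of the trained CNNs, and all function names, labels and keyword lists are illustrative assumptions rather than the actual CREES API.

```python
# Stand-in for a CREES-style annotation service. A keyword
# heuristic substitutes for the CNN classifiers; the labels and
# keyword lists below are illustrative assumptions only.

EVENT_KEYWORDS = {
    "hurricane": {"hurricane", "storm", "wind"},
    "flood": {"flood", "flooding", "water"},
}
INFO_KEYWORDS = {
    "affected_individuals": {"injured", "missing", "trapped"},
    "donations_volunteering": {"donate", "volunteer", "relief"},
}

def annotate(post: str) -> dict:
    """Return crisis-relatedness, event type and information
    category annotations for a single social media post."""
    tokens = set(post.lower().split())
    event = next((e for e, kw in EVENT_KEYWORDS.items() if tokens & kw), None)
    info = next((i for i, kw in INFO_KEYWORDS.items() if tokens & kw), None)
    return {
        "crisis_related": event is not None or info is not None,
        "event_type": event,
        "info_category": info,
    }
```

For example, `annotate("volunteers needed for flood relief")` would mark the post as crisis-related with event type `"flood"` and information category `"donations_volunteering"`; the real service returns analogous annotations from its neural models.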
Community and Thread Methods for Identifying Best Answers in Online Question Answering Communities
Much research has recently investigated the measurement of answer quality in Question Answering (Q&A) communities, in the form of automatic best answer identification. Previous approaches have relied on manual user annotations and diverse intuition-based features to identify best answers, and have proved relatively successful despite treating best answer identification as a generic classification problem.
Best answer modelling is generally distanced from community studies of what users regard as important for identifying quality content. In particular, previous research tends to focus only on the automatic aspects of best answer identification models, by applying generic learning algorithms.
This thesis introduces the concepts of qualitative and structural design in order to investigate whether features derived from community questionnaires can enrich the understanding of best answer identification in Q&A communities, and whether the thread-like structure of Q&A communities can be exploited for better results. Two different approaches for exploiting the thread structure of Q&A communities are proposed and two new, previously unstudied features are introduced. First, a measure of question complexity is introduced as a proxy measure of answerer knowledge. Second, different models of contribution effort are proposed for representing the answering reactivity of contributors.
The experiments are systematically conducted on datasets drawn from three different communities that vary in size, content and structure. The results show that the newly proposed features allow for a better understanding of what constitutes best answers. The findings also reveal that the thread-wise algorithms and optimisation techniques created from the structural design methodology correlate with best answers. In general, both structural and qualitative design appear to improve best answer identification, suggesting that structural and qualitative methods may also benefit other, unrelated classification tasks.
Automatic identification of best answers in online enquiry communities
Online communities are prime sources of information. The Web is rich with forums and Question Answering (Q&A) communities where people go to seek answers to all kinds of questions. Most systems employ manual answer-rating procedures to encourage people to provide quality answers and to help users locate the best answers in a given thread. However, in the datasets we collected from three online communities, we found that half of their threads lacked best answer markings. This stresses the need for methods to assess the quality of available answers to: 1) provide automated ratings to fill in for, or support, manually assigned ones; and 2) assist users when browsing such answers by highlighting potential best answers. In this paper, we collected data from three online communities and converted it to RDF based on the SIOC ontology. We then explored an approach for predicting best answers using a combination of content, user, and thread features. We show how the influence of such features on predicting best answers differs across communities. Further, we demonstrate how certain features unique to some of our community systems can boost the predictability of best answers.
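The combination of content, user and thread features described above can be sketched as a minimal scoring pipeline. The feature set and the linear weights below are illustrative assumptions standing in for the trained model, not the paper's actual classifier.

```python
# Sketch of best-answer prediction from content, user and thread
# features. Feature names and weights are illustrative assumptions,
# not the trained model from the paper.

def answer_features(answer: dict, thread: dict) -> dict:
    """One feature from each family: content, user, thread."""
    return {
        "length": len(answer["text"].split()),    # content feature
        "reputation": answer["user_reputation"],  # user feature
        "position": answer["position_in_thread"], # thread feature
    }

def score(feats: dict) -> float:
    # Hypothetical linear weights standing in for a trained classifier.
    return (0.02 * feats["length"]
            + 0.001 * feats["reputation"]
            - 0.1 * feats["position"])

def predict_best(thread: dict) -> dict:
    """Return the answer with the highest predicted quality score."""
    return max(thread["answers"],
               key=lambda a: score(answer_features(a, thread)))
```

A real system would learn the weights from threads that do carry best answer markings and then apply the model to the unmarked half.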
Modelling Question Selection Behaviour in Online Communities
The value of online Question Answering (Q&A) communities is driven by the question-answering behaviour of their members. Finding the questions that members are willing to answer is therefore vital to the efficient operation of such communities. In this paper, we aim to identify the parameters that correlate with such behaviours. We train different models and construct effective predictions using various user, question and thread feature sets. We show that answering behaviour can be predicted with a high level of success.
On Semantics and Deep Learning for Event Detection in Crisis Situations
In this paper, we introduce Dual-CNN, a semantically-enhanced deep learning model to target the problem of event detection in crisis situations from social media data. A layer of semantics is added to a traditional Convolutional Neural Network (CNN) model to capture the contextual information that is generally scarce in short, ill-formed social media messages. Our results show that our methods are able to successfully identify the existence of events and event types (hurricane, floods, etc.) accurately (> 79% F-measure), but the performance of the model drops significantly (61% F-measure) when identifying fine-grained event-related information (affected individuals, damaged infrastructure, etc.). These results are competitive with more traditional machine learning models, such as SVMs.
Predicting Answering Behaviour in Online Question Answering Communities
The value of Question Answering (Q&A) communities is dependent on members of the community finding the questions they are most willing and able to answer. This can be difficult in communities with a high volume of questions. Much previous work has attempted to address this problem by recommending questions similar to those already answered. However, this approach disregards the question selection behaviour of the answerers and how it is affected by factors such as question recency and reputation. In this paper, we identify the parameters that correlate with such behaviour by analysing the users' answering patterns in a Q&A community. We then generate a model to predict which question a user is most likely to answer next. We train Learning to Rank (LTR) models to predict question selections using various user, question and thread feature sets. We show that answering behaviour can be predicted with a high level of success, and highlight the particular features that influence users' question selections.
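The ranking task described above can be illustrated with a pointwise stand-in for the LTR models: each open question receives a score from user and question features, and the top-ranked question is predicted as the user's next answer. The features (recency, topical overlap, asker reputation) mirror the factors the abstract names, but the exact functional form and weights are assumptions.

```python
import math

# Pointwise stand-in for a Learning to Rank model over candidate
# questions. Feature choices and weights are illustrative
# assumptions, not the paper's trained models.

def question_score(question: dict, user: dict) -> float:
    recency = math.exp(-question["age_hours"] / 24)           # newer is better
    overlap = len(set(question["tags"]) & set(user["tags"]))  # topical fit
    asker_rep = math.log1p(question["asker_reputation"])      # dampened reputation
    return 2.0 * recency + 1.0 * overlap + 0.1 * asker_rep

def rank_questions(questions: list, user: dict) -> list:
    """Order open questions by predicted likelihood of being answered next."""
    return sorted(questions, key=lambda q: question_score(q, user), reverse=True)
```

In a trained LTR setting the weights would instead be fitted pairwise or listwise against observed question selections.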
Co-Spread of Misinformation and Fact-Checking Content during the Covid-19 Pandemic
In the context of the Covid-19 pandemic, the consequences of misinformation are a matter of life and death. Correcting misconceptions and false beliefs is important for injecting reliable information about the outbreak. Fact-checking organisations produce content with the aim of reducing misinformation spread, but our knowledge of its impact on misinformation is limited. In this paper, we explore the relation between misinformation and fact-checking spread during the Covid-19 pandemic. We specifically follow misinformation and fact-checks emerging from December 2019 to early May 2020. Through a combination of spread variance analysis, impulse response modelling and causal analysis, we show similarities in how misinformation and fact-checking information spread, and that fact-checking information has a positive impact in reducing misinformation. However, we observe that its efficacy can be reduced due to the general amount of online misinformation and the short-term spread of fact-checking information compared to misinformation.
On the Readability of Misinformation in Comparison to the Truth
Psychological studies have demonstrated that much misinformation circulating on the Web tends to be more believable and memorable due to its ease of processing. The readability of a passage is a crucial factor in the ease of processing, as it indicates how easy or difficult it is to read and understand. According to some qualitative research, if online misinformation is easier to read, it becomes stickier and more memorable. In contrast, other studies showed that people are more likely to trust and believe misinformation when it appears to be more complex. As a result of such conflicting findings, it remains unclear how readability is associated with true or false content on the Web in general. This paper aims to gain a deeper understanding of readability through quantitative analysis by applying six readability formulas to four datasets containing both true and false content, both within and across datasets. Our research shows that false claims are generally harder to read than true claims.
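One widely used readability formula of the kind applied above is Flesch Reading Ease, which scores a passage from its average sentence length and average syllables per word (higher scores mean easier reading). A minimal implementation is sketched below; the vowel-group syllable counter is a simplifying heuristic, not the exact tokenisation used in the paper.

```python
import re

# Minimal Flesch Reading Ease implementation. The syllable counter
# is a vowel-group heuristic; production readability tools use
# richer rules (silent-e handling, exception lists, etc.).

def count_syllables(word: str) -> int:
    """Approximate syllables as runs of vowels (at least one)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_reading_ease(text: str) -> float:
    """206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (n_words / sentences) - 84.6 * (syllables / n_words)
```

Under this formula, a passage of short sentences and short words ("The cat sat. The dog ran.") scores far higher, i.e. reads more easily, than dense polysyllabic prose, which is the kind of contrast the paper measures between true and false claims.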
Analysing engagement towards the 2014 Earth Hour Campaign in Twitter
Earth Hour (EH) is a large-scale campaign launched by the World Wide Fund For Nature (WWF) every year to raise awareness about environmental issues. Although the EH campaign is active on social media, there is currently no systematic way of assessing its impact on public engagement and the topics people post about. In this paper we study engagement towards the 2014 EH campaign on Twitter. By analysing more than 35K tweets around the campaign, we observed that posts that were longer, easier to read and positive in sentiment generated higher attention levels. Conversations were driven by the main themes of the campaign (the super hero, the panda, etc.), but engagement towards these themes did not always translate into engagement towards environmental issues. Users decreased their engagement towards the topics of the campaign after it finished, but these topics still remained in their conversations one month later.
Understanding Misogynoir: A Study of Annotators’ Perspectives
"Misogynoir" is the anti-Black racist misogyny experienced by Black women, which is characterised by components of both racism and sexism. Misogynoir is challenging to detect due to its inherent subjectivity and its intersectional nature, and people's opinions and interpretations of such hate might vary, which adds to the challenges of understanding it. In this paper, we explored how and some potential why's different annotator characteristics influence how they interpret and annotate a dataset for potential cases of Misogynoir and Allyship. We sampled tweets containing public responses to self-reported misogynoir cases by four prominent Black women in technology, designed an online annotation task study, and recruited annotators of diverse ethnicities and genders from the Prolific crowdsourcing platform. We found that participants' sources of evidence in judging and interpreting content for potential cases of Misogynoir and Allyship, even in circumstances where they all agree on a prospective label, vary across different factors, such as different ethnicity, lived experiences and gender. In addition, we present a variety of plausible interpretations influenced by the various annotators' characteristics. This study demonstrates the relevance of different annotator perspectives and content comprehension in hate speech and the need for further efforts to understand intersectional hate better